Course Information
Course Title
數據分析與流形學習
Data Analysis and Manifold Learning 
Semester
112-1 
Intended Audience
College of Electrical Engineering and Computer Science, Master's Program in Data Science
Instructor
林澤佑 
Course Number
Data5008 
Course Identifier
946 U0080 
Section
 
Credits
3.0 
Full/Half Year
Half year
Required/Elective
Elective
Class Time
Wednesday, periods 3-5 (10:20-13:10)
Location
綜402 
Remarks
Undergraduate students who wish to enroll should contact the instructor and attend the first week of class.
Restricted to master's students and above.
Maximum enrollment: 30 students.
 
Course Outline
Course Description

Manifold learning is a class of techniques used to identify patterns in high-dimensional data. The underlying assumption is that the data lies on or near a lower-dimensional manifold embedded in the high-dimensional space. The goal of manifold learning is to discover this lower-dimensional structure and represent the data in a simpler, more interpretable form. Common manifold learning techniques include principal component analysis (PCA), multidimensional scaling (MDS), ISOMAP, t-SNE, and many others. These methods differ in their assumptions, the types of data they are suitable for, and the complexity of the resulting representation.
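
As a concrete illustration of the methods named above, here is a minimal sketch (not part of the course materials) that embeds a synthetic Swiss-roll data set into two dimensions; the use of scikit-learn, the data set, and the hyperparameters are illustrative assumptions.

    # Illustrative sketch: compare several manifold learning methods on a
    # synthetic Swiss roll (assumes numpy and scikit-learn are installed).
    from sklearn.datasets import make_swiss_roll
    from sklearn.decomposition import PCA
    from sklearn.manifold import MDS, Isomap, TSNE

    # 1000 points sampled from a 2-D manifold (the roll) embedded in R^3.
    X, color = make_swiss_roll(n_samples=1000, random_state=0)

    embeddings = {
        "PCA": PCA(n_components=2).fit_transform(X),
        "MDS": MDS(n_components=2, random_state=0).fit_transform(X),
        "ISOMAP": Isomap(n_neighbors=10, n_components=2).fit_transform(X),
        "t-SNE": TSNE(n_components=2, random_state=0).fit_transform(X),
    }

    for name, Y in embeddings.items():
        print(name, X.shape, "->", Y.shape)  # each maps the data from R^3 to R^2

A linear method such as PCA can only flatten the roll, while neighborhood-based methods such as ISOMAP can unroll it; this difference in assumptions is precisely what the course examines.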

In this course, we will introduce the mathematical concepts and algorithms used in many manifold learning techniques, along with their underlying assumptions and limitations. At the end of the semester, students are required to complete a final project that applies manifold learning techniques. The project should involve preparing and analyzing real-world data using one or more manifold learning methods, and presenting the results in a clear and informative manner.

Course Objectives
Manifold learning is a branch of machine learning that focuses on nonlinear dimensionality reduction, which is often used for data preprocessing in data science. The goal of dimensionality reduction is to map data from a high-dimensional space to a lower-dimensional one while preserving the important information and relationships in the data.

In this course, we will provide an overview of some popular manifold learning techniques and demonstrate their implementation in Python. Both theoretical and practical aspects of each technique will be covered in the discussions.
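
As a minimal sketch of this theory-to-practice connection, assuming only numpy: PCA computed via the SVD of the centered data matrix (the relationship covered in Week 4 below). The function name and toy data are hypothetical, not course-provided code.

    # Illustrative sketch: PCA via SVD. If X_centered = U S V^T, then the rows
    # of X_centered @ V[:, :k] are the coordinates on the top-k principal axes.
    import numpy as np

    def pca_via_svd(X, k):
        """Project the rows of X (an (n, d) array) onto the top-k principal components."""
        X_centered = X - X.mean(axis=0)  # subtract each feature's mean
        U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
        return X_centered @ Vt[:k].T     # (n, k) low-dimensional representation

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))        # toy data: 200 points in R^5
    Y = pca_via_svd(X, k=2)
    print(Y.shape)                       # (200, 2)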

Week 1: Preliminary examination; basic graph theory and matrices associated with a graph [4]
Week 2: Special matrices and matrix eigenvalue problems
Week 3: Introduction to spectral and graph-based methods
Week 4: PCA and SVD [8]
Week 5: Exam I; Fisher linear discriminant [2]
Week 6: No class
Week 7: Laplacian embedding and spectral clustering [1] (see the sketch following this schedule)
Week 8: Multidimensional scaling [3]; locally linear embedding [11]
Week 9: ISOMAP [12]
Week 10: Exam II; introduction to kernel methods
Week 11: Kernel PCA [10]
Week 12: Diffusion kernels [5]
Week 13: Introduction to manifold reconstruction
Week 14: Local-surface-fitting-based methods [6], [7]
Week 15: Final project
Week 16: Final project
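
The sketch referenced in the Week 7 entry: a basic Laplacian embedding in the spirit of [1], using a k-nearest-neighbor graph with 0/1 weights and the unnormalized graph Laplacian. These construction choices and parameters are simplifying assumptions for illustration, not the course's implementation.

    # Illustrative sketch: Laplacian embedding (assumes numpy and scipy).
    import numpy as np
    from scipy.spatial.distance import cdist

    def laplacian_embedding(X, n_neighbors=10, n_components=2):
        n = X.shape[0]
        D = cdist(X, X)  # pairwise Euclidean distances
        # connect each point to its nearest neighbors (index 0 is the point itself)
        idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
        W = np.zeros((n, n))
        for i in range(n):
            W[i, idx[i]] = 1.0
        W = np.maximum(W, W.T)                 # symmetrize the adjacency matrix
        L = np.diag(W.sum(axis=1)) - W         # unnormalized graph Laplacian
        vals, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
        # drop the constant eigenvector (eigenvalue 0 for a connected graph)
        return vecs[:, 1:n_components + 1]

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))              # toy data: 100 points in R^3
    print(laplacian_embedding(X).shape)        # (100, 2)
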
Course Requirements
Note that this is a graduate-level course; mathematical maturity is essential for success.

Prerequisites: Calculus, Probability, Linear Algebra, Elementary Differential Geometry, Python
Homework: 50%
Exam I: 15%
Exam II: 15%
Project: 20% 
Expected Weekly Study Hours
 
Office Hours
 
Required Readings
 
References
Lectures will be based on slides drawn in part from the following references:
[1] Belkin, Mikhail, and Partha Niyogi. "Laplacian eigenmaps for dimensionality reduction and data representation." Neural Computation 15.6 (2003): 1373-1396.
[2] Bishop, Christopher M., and Nasser M. Nasrabadi. Pattern Recognition and Machine Learning. Vol. 4. No. 4. New York: Springer, 2006.
[3] Borg, Ingwer, and Patrick J. F. Groenen. Modern Multidimensional Scaling: Theory and Applications. Springer Science & Business Media, 2005.
[4] Chung, Fan R. K. Spectral Graph Theory. Vol. 92. American Mathematical Society, 1997.
[5] De la Porte, J., et al. "An introduction to diffusion maps." Proceedings of the 19th Symposium of the Pattern Recognition Association of South Africa (PRASA 2008), Cape Town, South Africa. 2008.
[6] Faigenbaum-Golovin, Shira, and David Levin. "Manifold reconstruction and denoising from scattered data in high dimension." Journal of Computational and Applied Mathematics (2022): 114818.
[7] Fefferman, Charles, et al. "Reconstruction and interpolation of manifolds. I: The geometric Whitney problem." Foundations of Computational Mathematics 20.5 (2020): 1035-1133.
[8] Jolliffe, Ian. "Principal component analysis." Encyclopedia of Statistics in Behavioral Science (2005).
[9] Ma, Yunqian, and Yun Fu. Manifold Learning Theory and Applications. Vol. 434. Boca Raton, FL: CRC Press, 2012.
[10] Mika, Sebastian, et al. "Kernel PCA and de-noising in feature spaces." Advances in Neural Information Processing Systems 11 (1998).
[11] Saul, Lawrence K., and Sam T. Roweis. "An introduction to locally linear embedding." Unpublished. Available at: http://www.cs.toronto.edu/~roweis/lle/publications.html (2000).
[12] Tenenbaum, Joshua B., Vin De Silva, and John C. Langford. "A global geometric framework for nonlinear dimensionality reduction." Science 290.5500 (2000): 2319-2323.